We present a user-centric validation of a teleneurology platform, assessing its effectiveness in conveying screening information, facilitating user queries, and offering resources to enhance user empowerment. This validation is carried out in the setting of Parkinson's disease (PD), in collaboration with the neurology department of a major medical center in the USA. Our intention is that, with this platform, anyone globally with a webcam- and microphone-equipped computer can carry out a series of speech, motor, and facial mimicry tasks. Our validation method demonstrates a mock PD risk assessment to users and provides access to relevant resources, including a GPT-driven chatbot, locations of local neurologists, and actionable, scientifically backed recommendations for PD prevention and management. We share findings from 91 participants (48 with PD, 43 without) aimed at evaluating the user experience and collecting feedback. Our framework was rated positively by 80.85% (standard deviation ± 8.92%) of participants and achieved an above-average System Usability Scale (SUS) score of 70.42 (standard deviation ± 13.85). We also conducted a thematic analysis of open-ended feedback to further inform our future work. When given the option to ask the chatbot any questions, participants most often asked about neurologists, their screening results, and the community support group. We also provide a roadmap for how the knowledge generated in this paper can be generalized to screening frameworks for other diseases through the design of appropriate recording environments, appropriate tasks, and tailored user interfaces.
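For readers unfamiliar with the metric, the SUS score reported above is conventionally computed from ten 5-point Likert items. The sketch below shows the standard scoring scheme; the abstract does not reproduce the questionnaire itself, so this reflects the conventional SUS calculation rather than anything specific to the platform.

```python
def sus_score(responses):
    """Standard System Usability Scale scoring for one participant.

    `responses` is a list of 10 Likert ratings (1-5) in questionnaire order.
    Odd-numbered (positively worded) items contribute (rating - 1); even-numbered
    (negatively worded) items contribute (5 - rating). The sum is scaled by 2.5
    to yield a 0-100 score.
    """
    assert len(responses) == 10
    total = 0
    for i, rating in enumerate(responses, start=1):
        total += (rating - 1) if i % 2 == 1 else (5 - rating)
    return total * 2.5

# Example: one fairly positive participant.
print(sus_score([4, 2, 4, 2, 5, 1, 4, 2, 4, 3]))  # -> 77.5
```

Averaging such per-participant scores across the 91 participants would give the aggregate value of the kind reported above.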
There has been a rise in automated technologies to screen potential job applicants through affective signals captured from video-based interviews. These tools can make the interview process scalable and objective, but they often provide little to no information about how the machine learning model makes crucial decisions that impact the livelihood of thousands of people. We built an ensemble model, combining Multiple-Instance-Learning and Language-Modeling based models, that predicts whether an interviewee should be hired. Using both model-specific and model-agnostic interpretation techniques, we decipher the most informative time segments and features driving the model's decision making. Our analysis also shows that our models are significantly influenced by the beginning and ending portions of the video. Our model achieves 75.3% accuracy in predicting whether an interviewee should be hired on the ETS Job Interview dataset. Our approach can be extended to interpret other video-based affective computing tasks, such as analyzing sentiment, measuring credibility, or coaching individuals to collaborate more effectively in a team.
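As a hedged illustration of the fusion step described above, the sketch below combines per-interview hire probabilities from the two component models with a simple weighted average. The weighting and decision threshold are assumptions chosen for illustration; the paper's exact ensembling strategy may differ.

```python
import numpy as np

def ensemble_hire_probability(mil_probs, lm_probs, mil_weight=0.5):
    """Combine hire probabilities from a Multiple-Instance-Learning model
    and a language-modeling based model via a weighted average.
    The 0.5 weight is an illustrative assumption, not the paper's setting."""
    mil_probs = np.asarray(mil_probs, dtype=float)
    lm_probs = np.asarray(lm_probs, dtype=float)
    return mil_weight * mil_probs + (1.0 - mil_weight) * lm_probs

# Example: two candidate interviews scored by each component model.
combined = ensemble_hire_probability([0.62, 0.35], [0.70, 0.28])
decisions = combined >= 0.5  # recommend hiring when the combined probability crosses 0.5
print(combined, decisions)
```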
Recent Transformer-based contextual word representations, including BERT and XLNet, have shown state-of-the-art performance across multiple disciplines within NLP. Fine-tuning the trained contextual models on task-specific datasets has been the key to achieving superior performance downstream. While fine-tuning these pre-trained models is straightforward for lexical applications (applications with only the language modality), it is not trivial for multimodal language (a growing area in NLP focused on modeling face-to-face communication), because the pre-trained models lack the components needed to accept the two extra modalities of vision and acoustics. In this paper, we propose an attachment to BERT and XLNet called the Multimodal Adaptation Gate (MAG). MAG allows BERT and XLNet to accept multimodal nonverbal data during fine-tuning by generating a shift to the internal representation of BERT and XLNet, a shift conditioned on the visual and acoustic modalities. In our experiments, we study the commonly used CMU-MOSI and CMU-MOSEI datasets for multimodal sentiment analysis. Fine-tuning MAG-BERT and MAG-XLNet significantly boosts sentiment analysis performance over previous baselines as well as language-only fine-tuning of BERT and XLNet. On the CMU-MOSI dataset, MAG-XLNet achieves human-level multimodal sentiment analysis performance for the first time in the NLP community.
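The following is a minimal PyTorch sketch of a MAG-style module as described above: a gated, norm-bounded shift to the language representation, conditioned on visual and acoustic features. The specific gating functions, the scaling cap beta, and all dimensions are illustrative assumptions rather than the authors' released implementation.

```python
import torch
import torch.nn as nn

class MultimodalAdaptationGate(nn.Module):
    """Sketch of a MAG-style attachment: shifts the language representation h
    using aligned visual (v) and acoustic (a) features. Gating and scaling
    details here are assumptions for illustration."""

    def __init__(self, text_dim, visual_dim, acoustic_dim, beta=0.5, eps=1e-6):
        super().__init__()
        # Gates decide how much each nonverbal stream should influence h.
        self.gate_v = nn.Linear(text_dim + visual_dim, text_dim)
        self.gate_a = nn.Linear(text_dim + acoustic_dim, text_dim)
        # Projections map nonverbal features into the text embedding space.
        self.proj_v = nn.Linear(visual_dim, text_dim)
        self.proj_a = nn.Linear(acoustic_dim, text_dim)
        self.layer_norm = nn.LayerNorm(text_dim)
        self.beta = beta  # caps the magnitude of the nonverbal shift
        self.eps = eps

    def forward(self, h, visual, acoustic):
        # h: (batch, seq_len, text_dim); visual/acoustic aligned to the same seq_len.
        g_v = torch.sigmoid(self.gate_v(torch.cat([h, visual], dim=-1)))
        g_a = torch.sigmoid(self.gate_a(torch.cat([h, acoustic], dim=-1)))
        # Nonverbal displacement conditioned on both modalities.
        shift = g_v * self.proj_v(visual) + g_a * self.proj_a(acoustic)
        # Bound the shift so it cannot overwhelm the language representation.
        alpha = torch.clamp(
            h.norm(dim=-1, keepdim=True) / (shift.norm(dim=-1, keepdim=True) + self.eps),
            max=1.0,
        ) * self.beta
        return self.layer_norm(h + alpha * shift)
```

In a MAG-BERT-style setup, such a module would sit after an early Transformer layer during fine-tuning, with the shifted representation fed to the remaining layers unchanged.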